
    FLAIR: Federated Learning Annotated Image Repository

    Cross-device federated learning is an emerging machine learning (ML) paradigm in which a large population of devices collectively trains an ML model while the data remains on the devices. This research field has a unique set of practical challenges, and to make systematic advances, new datasets curated to be compatible with this paradigm are needed. Existing federated learning benchmarks in the image domain do not accurately capture the scale and heterogeneity of many real-world use cases. We introduce FLAIR, a challenging large-scale annotated image dataset for multi-label classification suitable for federated learning. FLAIR has 429,078 images from 51,414 Flickr users and captures many of the intricacies typically encountered in federated learning, such as heterogeneous user data and a long-tailed label distribution. We implement multiple baselines in different learning setups for different tasks on this dataset. We believe FLAIR can serve as a challenging benchmark for advancing the state of the art in federated learning. Dataset access and the code for the benchmark are available at https://github.com/apple/ml-flair.
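
    To make the cross-device setup concrete, the sketch below runs one federated averaging (FedAvg) round over per-user image partitions with a multi-label sigmoid objective. It is illustrative only: FedAvg is one common baseline in this setting, and the model, label count, and toy data are hypothetical stand-ins rather than the benchmark's actual pipeline (see the repository above for that).

        import copy
        import torch
        import torch.nn as nn

        NUM_LABELS = 10  # hypothetical label count; FLAIR defines its own label taxonomy

        def local_update(model, images, targets, lr=0.1, epochs=1):
            # One user's local pass: images and targets never leave the "device".
            local = copy.deepcopy(model)
            opt = torch.optim.SGD(local.parameters(), lr=lr)
            loss_fn = nn.BCEWithLogitsLoss()  # multi-label: independent sigmoid per label
            for _ in range(epochs):
                opt.zero_grad()
                loss_fn(local(images), targets).backward()
                opt.step()
            return local.state_dict()

        def fedavg_round(model, user_batches):
            # The server averages the locally updated weights; raw data is never shared.
            states = [local_update(model, x, y) for x, y in user_batches]
            avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
            model.load_state_dict(avg)
            return model

        # Toy demo: three "users", each holding a private batch of flattened 32x32 RGB images.
        model = nn.Linear(32 * 32 * 3, NUM_LABELS)
        users = [(torch.randn(8, 32 * 32 * 3), torch.randint(0, 2, (8, NUM_LABELS)).float())
                 for _ in range(3)]
        model = fedavg_round(model, users)

    In a real cross-device deployment the server samples only a small fraction of the user population each round and typically weights the average by local dataset size; the uniform mean above is the simplest variant.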

    Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices

    Federated Learning (FL) is a technique for training models using data distributed across devices. Differential Privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP noise added to the model grows with model size, which often prevents convergence. We propose Partial Embedding Updates (PEU), a novel technique that decreases noise by decreasing payload size. Furthermore, we adopt Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. This combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.
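
    To make the payload-size argument concrete, the sketch below shows the standard LoRA construction in PyTorch: the full weight matrix is frozen and only a low-rank update is trained, so the trainable payload a device must upload (and that DP noise must cover) shrinks from out*in to roughly r*(out+in) parameters. The layer sizes, rank, and scaling here are illustrative assumptions, not the paper's exact PEU/LoRA/NCE recipe.

        import torch
        import torch.nn as nn

        class LoRALinear(nn.Module):
            # Frozen base weight plus a trainable low-rank update (scale * B @ A).
            def __init__(self, in_features, out_features, r=4, alpha=8):
                super().__init__()
                self.base = nn.Linear(in_features, out_features)
                self.base.weight.requires_grad_(False)  # pretrained weight stays fixed
                self.base.bias.requires_grad_(False)
                self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
                self.lora_b = nn.Parameter(torch.zeros(out_features, r))  # zero init: no update at step 0
                self.scale = alpha / r

            def forward(self, x):
                return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

        layer = LoRALinear(512, 10000)  # e.g. hidden size 512 into a large output vocabulary
        trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
        total = sum(p.numel() for p in layer.parameters())
        print(f"trainable payload: {trainable:,} of {total:,} parameters")

    Per the abstract, PEU applies the same principle to the embedding table itself, updating only part of it; the low-rank form above is the generic LoRA recipe rather than the paper's specific variant.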

    Investigating the function of GroES with hard-to-fold proteins in vivo

    The use of molecular chaperones can increase the yield of correctly folded proteins, which is especially needed when expressing proteins non-native to the host organism. This study set out to investigate the function of the chaperone GroES, a component of the GroE system. The ability of this chaperone to act alone has so far been studied only in vitro; here we lay the groundwork for further studies of GroES acting alone in vivo. GroES was expressed from a plasmid and characterized by its potential to increase the amount of correctly folded protein. Characterization was done mainly by fluorescence spectroscopy with hard-to-fold proteins linked to fluorescent probes. The results show a clear increase in fluorescence for most of the substrate proteins tested, indicating that GroES plays a significant role in the GroE system and perhaps outside of it.